Yuhang Jiang

Detection vs. Execution: Single-Bucket Probes Miss Half the Mamba-2 State Sink

May 30, 2026

Task Structure Reverses Layerwise State Encoding in Sequence Models

May 30, 2026

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

May 28, 2026

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

May 07, 2026

Closing Reasoning Gaps in Clinical Agents with Differential Reasoning Learning

Feb 10, 2026

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models

Feb 02, 2026

Unsupervised Data Generation for Offline Reinforcement Learning: A Perspective from Model

Jun 24, 2025

A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models

Apr 05, 2025

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Dec 15, 2024

Can ChatGPT Overcome Behavioral Biases in the Financial Sector? Classify-and-Rethink: Multi-Step Zero-Shot Reasoning in the Gold Investment

Nov 19, 2024